Page Layout Analysis System for Unconstrained Historic Documents
Authors
Abstract
Extraction of text regions and individual lines from historic documents is necessary for automatic transcription. We propose extending a CNN-based baseline detection system by adding line height and block boundary predictions to the model output, allowing it to extract more comprehensive layout information. We also show that pixel-wise orientation prediction can be used for processing documents with multiple orientations. We demonstrate that the proposed method performs well on the cBAD dataset. Additionally, we benchmark the method on the newly introduced PERO dataset, which we make public.
Similar resources
Page Layout Classification Technique for Biomedical Documents
The structural layout information of scanned document pages is valuable for a wide range of document processing applications such as automatic document searching, document delivery and automated data entry. This paper describes the classification of scanned document pages into different classes of physical layout structures. The page layout classification technique proposed in this paper uses a...
Adaptive layout for interactive documents
In many application domains there is a strong need to produce content both for traditional print media and for interactive media. In order to fully benefit from digital devices, online documents must provide mechanisms to support interactivity and for the personalization of content. Thus, powerful authoring tools as well as flexible layout techniques are needed to display dynamic information ef...
Eye-tracking Analysis for Automatic Documents Eye-catching Layout Retrieval
In this paper we present a synthesis of experiments of eye movement pursuit that have been applied to documents structure retrieval. The aim of this work is to propose a representation of structured documents content (the physical layout) through the simulation of a possible human inspired scan path. The research project which is presented here is based on the hypotheses that the analysis and t...
IJEL 4/1 page layout
Instructional text, and procedural text in particular, is a genre that users heavily rely upon when they are learning new procedures, devices or systems. It is, however, also well-known to be a genre that is difficult to produce and maintain. This article discusses Isolde, an environment that attempts to address this problem by supporting the semi-automated production of procedural instructions...
IJEL 4/3 page layout
Studies have shown that when learning occurs in an environment that uses animated pedagogical agents and personalized instruction, the learner learns the material more deeply and can recall it easier when compared to learning without an agent. Thus, an effective learning system creates personalized contexts for each learner. The one size fits all concept is not very effective across a large n...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-86331-9_32